Using Distributed Query Result Caching to Evaluate Queries for Parallel Data Mining Algorithms

Authors

Merwyn G. Taylor

James A. Hendler

Joel Saltz

Abstract

An increase in the speed of data mining algorithms can be achieved by improving the efficiency of the underlying technologies. Query engines are key components in many knowledge discovery systems, and the appropriate use of query engines can impact the performance of data mining algorithms. By taking advantage of hypothesis generation patterns, queries generated from the hypotheses can be evaluated more efficiently. Caching query results and using the cached results to evaluate new queries with similar constraints reduces the complexity of query evaluation and improves the performance of data mining algorithms. In a multi-processor environment, distributing the query result caches can improve the performance of parallel query evaluations. This idea has been used in the ParDRI system and has resulted in significant improvements in the execution times of ParDRI.
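The reuse idea in the abstract can be illustrated with a minimal Python sketch. It is not ParDRI's implementation, which the abstract does not describe; the class name `QueryResultCache`, the representation of a query as a set of (attribute, value) equality constraints, and the `rows_scanned` counter are all assumptions made for illustration. A new query whose constraints extend those of a cached query is answered by filtering the cached rows instead of rescanning the full table.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Row = Dict[str, str]
Constraint = Tuple[str, str]  # hypothetical form: (attribute, required value)


class QueryResultCache:
    """Illustrative cache of conjunctive equality queries over an in-memory table."""

    def __init__(self, table: List[Row]) -> None:
        self.table = table
        self.cache: Dict[FrozenSet[Constraint], List[Row]] = {}
        self.rows_scanned = 0  # counts rows examined, to show the saving

    def _filter(self, rows: List[Row], constraints: Iterable[Constraint]) -> List[Row]:
        wanted = list(constraints)
        self.rows_scanned += len(rows)
        return [r for r in rows if all(r.get(a) == v for a, v in wanted)]

    def evaluate(self, constraints: Iterable[Constraint]) -> List[Row]:
        key = frozenset(constraints)
        if key in self.cache:
            return self.cache[key]
        # Pick the smallest cached result whose constraints are a subset of
        # the new query's constraints; fall back to the full table otherwise.
        best_key: FrozenSet[Constraint] = frozenset()
        best_rows = self.table
        for cached_key, rows in self.cache.items():
            if cached_key <= key and len(rows) < len(best_rows):
                best_key, best_rows = cached_key, rows
        # Only the constraints not already enforced by the cached result remain.
        result = self._filter(best_rows, key - best_key)
        self.cache[key] = result
        return result


table = [
    {"color": "red", "size": "L"},
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "L"},
    {"color": "blue", "size": "S"},
]
cache = QueryResultCache(table)
general = cache.evaluate([("color", "red")])                   # scans the full table
specific = cache.evaluate([("color", "red"), ("size", "L")])   # scans only cached rows
```

In a multi-processor setting, one plausible arrangement is for each worker to hold its own cache instance over its share of the queries or data; how ParDRI actually distributes its caches is not stated in the abstract.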

Similar resources

Improving the view selection algorithm in data warehouses by finding frequent queries

A data warehouse is a source for storing historical data to support decision making. Analytic queries usually take a long time. To solve the response time problem, some views should be materialized so that all queries can be answered in minimum response time. There are many solutions to the view selection problem. The most appropriate solution for view selection is materializing frequent queries. Previously posed ...

Multiple query scheduling for distributed semantic caches

In distributed query processing systems, load balancing plays an important role in maximizing system throughput. When queries can leverage cached intermediate results, improving the cache hit ratio becomes as important as load balancing in query scheduling, especially when dealing with computationally expensive queries. The scheduling policies must be designed to take into consideration the dyn...

Parallel Visual Information Retrieval in VizIR

This paper describes how parallel retrieval is implemented in the content-based visual information retrieval framework VizIR. Generally, two major use cases for parallelisation exist in visual retrieval systems: distributed querying and simultaneous multi-user querying. Distributed querying includes parallel query execution and querying multiple databases. Content-based querying is a two-step p...

Separating indexes from data: a distributed scheme for secure database outsourcing

Database outsourcing is a way to relieve organizations of the burden of database management. Since data is a critical asset of organizations, its privacy must be protected from outside adversaries and untrusted servers. In this paper, we present a distributed scheme based on storing shares of data on different servers and separating indexes from data on a distinct server. Shamir...

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they require too much memory and processing to be practical, so random and evolutionary methods must be used instead. The use of evolutionary methods, beca...

Publication date: 1998